Goto

Collaborating Authors

 iqa dataset


Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare

Neural Information Processing Systems

While recent advancements in large multimodal models (LMMs) have significantly improved their abilities in image quality assessment (IQA) relying on absolute quality rating, how to transfer reliable relative quality comparison outputs to continuous perceptual quality scores remains largely unexplored.


Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare

Neural Information Processing Systems

While recent advancements in large multimodal models (LMMs) have significantly improved their abilities in image quality assessment (IQA) relying on absolute quality rating, how to transfer reliable relative quality comparison outputs to continuous perceptual quality scores remains largely unexplored. To address this gap, we introduce an all-around LMM-based NR-IQA model, which is capable of producing qualitatively comparative responses and effectively translating these discrete comparison outcomes into a continuous quality score. Specifically, during training, we present to generate scaled-up comparative instructions by comparing images from the same IQA dataset, allowing for more flexible integration of diverse IQA datasets. Utilizing the established large-scale training corpus, we develop a human-like visual quality comparator. During inference, moving beyond binary choices, we propose a soft comparison method that calculates the likelihood of the test image being preferred over multiple predefined anchor images. The quality score is further optimized by maximum a posteriori estimation with the resulting probability matrix. Extensive experiments on nine IQA datasets validate that the Compare2Score effectively bridges text-defined comparative levels during training with converted single image quality scores for inference, surpassing state-of-the-art IQA models across diverse scenarios. Moreover, we verify that the probability-matrix-based inference conversion not only improves the rating accuracy of Compare2Score but also zero-shot general-purpose LMMs, suggesting its intrinsic effectiveness.


Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare

Neural Information Processing Systems

While recent advancements in large multimodal models (LMMs) have significantly improved their abilities in image quality assessment (IQA) relying on absolute quality rating, how to transfer reliable relative quality comparison outputs to continuous perceptual quality scores remains largely unexplored.


Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare

Neural Information Processing Systems

While recent advancements in large multimodal models (LMMs) have significantly improved their abilities in image quality assessment (IQA) relying on absolute quality rating, how to transfer reliable relative quality comparison outputs to continuous perceptual quality scores remains largely unexplored. To address this gap, we introduce an all-around LMM-based NR-IQA model, which is capable of producing qualitatively comparative responses and effectively translating these discrete comparison outcomes into a continuous quality score. Specifically, during training, we present to generate scaled-up comparative instructions by comparing images from the same IQA dataset, allowing for more flexible integration of diverse IQA datasets. Utilizing the established large-scale training corpus, we develop a human-like visual quality comparator. During inference, moving beyond binary choices, we propose a soft comparison method that calculates the likelihood of the test image being preferred over multiple predefined anchor images.


The First Comprehensive Dataset with Multiple Distortion Types for Visual Just-Noticeable Differences

Liu, Yaxuan, Jin, Jian, Xue, Yuan, Lin, Weisi

arXiv.org Artificial Intelligence

Recently, with the development of deep learning, a number of Just Noticeable Difference (JND) datasets have been built for JND modeling. However, all the existing JND datasets only label the JND points based on the level of compression distortion. Hence, JND models learned from such datasets can only be used for image/video compression. As known, JND is a major characteristic of the human visual system (HVS), which reflects the maximum visual distortion that the HVS can tolerate. Hence, a generalized JND modeling should take more kinds of distortion types into account. To benefit JND modeling, this work establishes a generalized JND dataset with a coarse-to-fine JND selection, which contains 106 source images and 1,642 JND maps, covering 25 distortion types. To this end, we proposed a coarse JND candidate selection scheme to select the distorted images from the existing Image Quality Assessment (IQA) datasets as JND candidates instead of generating JND maps ourselves. Then, a fine JND selection is carried out on the JND candidates with a crowdsourced subjective assessment.